ABSTRACT : The ability to obtain accurate predictions of bus arrival time on a real-time basis is vital to both bus
operations control and passenger information systems. Several studies in many countries have been devoted to this arrival
time prediction problem; however, few have resulted in completely satisfactory algorithms. This paper presents an effective
method that can be used to predict the expected bus arrival time at individual bus stops along a service route. This method is
a hybrid scheme that combines a neural network (NN), which infers decision rules from historical data, with a Kalman filter
(KF), which fuses prediction calculations with current GPS measurements. The proposed algorithm relies on real-time location
data and takes into account historical travel times as well as temporal and spatial variations of traffic conditions. A case
study on a real bus route is conducted to evaluate the performance of the proposed algorithm in terms of prediction accuracy.
The results indicate that the system is capable of achieving satisfactory performance and accuracy in predicting bus arrival
times for Egyptian environments.
I. INTRODUCTION

Traffic plays an important role in modern urban society. Because traffic resources are limited, increases in traffic
demand lead to urban congestion. To relieve this congestion, governments around the world provide funding and support
to develop public transport systems and build traffic applications, such as subway systems, signal control systems, traffic
information management systems and electronic toll collection systems, which fall under intelligent transportation
systems.

This paper is part of a research project called Transportation Management and User Awareness (TMUA). The project
is financially supported by NTRA and aims to interconnect public transportation vehicles and bus stations with a
central office that monitors the underlying vehicles. Based on the collected data and an analysis of road conditions, accurate
arrival times can be computed and transmitted to all relevant stations. The waiting time for the next bus(es) to arrive will be
announced on screens and via audio speakers (in Arabic) to commuters at the bus station. Passengers on buses will be
notified of the next bus stop using visual and audio announcements. Achieving these main features will bring a major
improvement in public transport convenience and safety.

It will also allow the central offices to manage their resources (mainly buses) effectively through better route
planning in relation to peak hours and congested zones. In the proposed system, virtually all data that are collected and
stored are multi-dimensional. Typically, a range of features is measured at a particular time or condition and stored as a
complex data object. The data come in the form of a vector of real values. Different approaches use the spread of the data to
suggest a new basis by choosing the directions that maximize the variance of the observations. If there are significant
correlations between the different features, the number of features required to capture the data will be decreased.
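
The variance-maximizing basis described above corresponds to what is usually called principal component analysis (PCA). The following is a minimal sketch, not the project's implementation: it finds the single direction of largest variance by power iteration on the sample covariance matrix. The function name `principal_direction` and the two-feature records (average speed, link travel time) are hypothetical and chosen only for illustration.

```python
import math

def principal_direction(rows, iterations=200):
    """Return the unit vector along which the centered data has maximum variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical records: (average speed in km/h, link travel time in minutes).
records = [(40.0, 5.0), (30.0, 7.0), (20.0, 10.0), (35.0, 6.0), (25.0, 8.5)]
direction = principal_direction(records)
```

Because speed and travel time are strongly (negatively) correlated in this toy data, a single direction captures almost all of the spread, which is the sense in which correlated features reduce the number of features needed.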

Arrival-time calculation depends on vehicle speed, traffic flow and occupancy, which are highly sensitive to 
weather conditions and traffic incidents. These features make travel-time prediction very complex, and optimal accuracy is
difficult to reach. Nonetheless, daily, weekly and seasonal patterns can still be observed at a large scale. For instance, daily 
patterns distinguish rush hour and late night traffic, weekly patterns distinguish weekday and weekend traffic, while seasonal 
patterns distinguish winter and summer traffic. The time-varying feature germane to traffic behavior is the key to travel-time 
modeling. 

This research will assess what traffic, transit and freight data are available today from various sources, and consider
how to integrate data from buses acting as "probes" in the system. Some obvious information is obtained easily through
traditional query operations on the traffic database, but deeper information hidden in the traffic database is difficult to
discover. This deep-level information usually contains characteristics of the data and forecasts of data development
tendencies.

Therefore, we are concerned with how to develop a powerful data-mining algorithm and apply it to the available
data. In addition, a model-based predictor based on a Kalman filter can be employed when the data-mining
algorithm fails. In this paper, we propose a hybrid neural network and Kalman filter technique to predict the bus
arrival time. The paper is organized as follows. The literature review is presented in Section II. Bus arrival time prediction
methods are illustrated in Section III. The proposed scheme for time prediction is presented in Sections IV to VII. Simulation
results and discussions are given in Section VIII, and finally conclusions are drawn in Section IX.






International Journal of Modern Engineering Research (IJMER) 
www.ijmer.com Vol. 3, Issue. 4, Jul - Aug. 2013 pp-2035-2041 ISSN: 2249-6645

II. Literature Reviews 

In this section, previous work related to bus travel time prediction is summarized. The main idea
of time prediction is based on the fact that traffic behaviors possess both partially deterministic and partially chaotic
properties. Forecasting results can be obtained by reconstructing the deterministic traffic motion and predicting the random
behaviors caused by unanticipated factors. Suppose that the current time is t. Given the historical data f(t-1), f(t-2), ...,
f(t-n) at times t-1, t-2, ..., t-n, we can predict the future values f(t+1), f(t+2), ... by analyzing the historical data set. Hence,
future values can be forecasted based on the correlation between the time-variant historical data set and its outcomes.
Bus arrival time prediction models can be classified into three main groups: mathematical algorithms (Historical
Approach, Real-Time Approach and Statistical Models), the Kalman filter model with historical data, and the artificial neural
network (ANN) model, which are discussed in the next section.

In 1999, Lin and Zeng developed a mathematical algorithm to provide real-time bus arrival information [1]. Their
algorithm considered schedule information, bus location data, the difference between scheduled and actual arrival time, and
waiting time at time-check stops, but it did not consider traffic congestion or dwell time at bus stops.
In the same year, Ojili developed a bus arrival time notification system in College Station [2]. The model breaks the bus
route into one-minute time zones. The bus arrival time at a given stop is predicted by counting the estimated number of
one-minute time zones between the current location and the given stop. The model had the same limitation as Lin and Zeng's:
it does not consider traffic congestion or dwell time at bus stops.

Wall and Dailey were the first authors to use a Kalman filter model to predict bus arrival time [3]. Their algorithm
combined global positioning system (GPS) data with historical data: a Kalman filter model tracked the vehicle location,
and a statistical estimation technique predicted travel time. They found that they could predict
bus arrival time with less than 12% error. However, they did not explicitly deal with dwell time as an independent variable.
In 2003, Shalaby and Farhan proposed another bus travel time prediction model using the Kalman filtering technique [4].
Their model considered the passenger information at each bus stop. However, they predicted dwell time only at time
check points, not at every bus stop. Because of their capability to solve complex non-linear relationships, artificial neural
network (ANN) models have been used to model transportation problems, and they have shown better results than
existing models. In 2002, Chien et al. developed an artificial neural network model to predict dynamic bus arrival time in New
Jersey. Considering that the back-propagation algorithm is unsuitable for on-line application, the authors developed an
adjustment factor to modify their travel time prediction using recently observed real-time data. However, dwell time and
scheduled data were not considered in their model [5].

In 2004, Jeong and Rilett provided a historical-data-based model, regression models and artificial neural network
(ANN) models to predict bus travel time by considering traffic congestion, schedule adherence and dwell times at stops [6].
In 2006, Ramakrishna et al. proposed a multiple linear regression model and an ANN model to predict bus travel times. Their
models considered real-time GPS data of bus locations [7].

In 2009, Suwardo et al. proposed a statistical neural network model to predict the bus travel time in mixed traffic,
considering bus travel time, distance, average speed, number of bus stops, and traffic conditions. In their paper, they
assessed those factors and studied the relationship between the factors and bus travel time [8].

In 2011, Feng Li et al. proposed a statistical model to predict the bus arrival time based on proposed linear [9]. In their
paper, they considered all of the evaluated factors, such as departure time, driver characteristics, dwell time, intersections
and traffic conditions.

Among the above models, the artificial neural network model and the statistical neural network model have shown
advantages over other models, such as the Kalman filter model, the historical average model, the auto-regressive integrated
moving average (ARIMA) model and the exponential smoothing model. However, the parameters of those models are hard to
determine, because they need more historical data and take more time to fit. Moreover, although the models can provide bus
arrival time predictions, their mechanism is hard to explain. For those reasons, in this paper we
provide a hybrid Kalman filter and neural network approach to forecast bus arrival time.

III. Bus Arrival Time Prediction methods: 

Bus arrival time prediction has been studied by many researchers in recent years. Different approaches were studied
for time prediction, such as:

Historical Approach: Predicts the travel time at a particular time as the average travel time for the same period over
different days.

Real-Time Approach: Predicts the travel time at the next time interval to be the same as that in the present time interval.
This approach assumes that the bus travel time fluctuates within a narrow range, which does not hold for actual traffic
because of incidents, congestion and other unpredictable traffic conditions.

Statistical Models: Predict the bus arrival time based on a function formed by a set of independent variables.

Model-Based Approaches: The Kalman filter algorithm outperformed all other developed models in terms of accuracy,
demonstrating the dynamic ability to update itself based on new data that reflect the changing characteristics of the
transit-operating environment. The algorithm is therefore used to update the state variable (travel time) continuously as new
observations become available.

Machine Learning Techniques: The artificial neural network (ANN) is one of the most commonly reported techniques for
traffic prediction, mainly because of its ability to solve complex non-linear relationships [10]. Table 1 shows a comparison
between different time prediction techniques and summarizes the approaches to estimating the bus travel time.

Table 1: Comparison between different time prediction techniques

| Technique | Remarks | Delay considered |
|---|---|---|
| Historical approach | Predict the travel time at a particular time as the average travel time for the same period historically | No |
| Real-time approach | Assume the future travel time to be the same as the present one | No |
| Time series analysis approach | Assume the historical patterns will remain the same in the future | No |
| Statistical models | Predict the dependent variable based on a function formed by a set of independent variables | Yes |
| Machine learning techniques | Prediction based on example data. Need a large database for an accurate prediction | No |
| Model-based approaches (Kalman filter) | Establish relationships between the variables and then corroborate them using field observation. Not site specific or data specific | Yes |
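
The two simplest entries in the comparison, the historical approach and the real-time approach, can be sketched in a few lines. This is an illustration only: the function names and the per-day dictionaries of travel times are hypothetical and are not taken from the TMUA database.

```python
from statistics import mean

def historical_prediction(history, period):
    """Historical approach: average travel time for the same period over different days."""
    return mean(day[period] for day in history)

def real_time_prediction(latest_observed):
    """Real-time approach: assume the next interval repeats the present one."""
    return latest_observed

# Hypothetical travel times in minutes for one link, one dict per day, keyed by period.
history = [
    {"morning": 6.0, "afternoon": 9.0},
    {"morning": 5.0, "afternoon": 8.0},
    {"morning": 7.0, "afternoon": 10.0},
]
hist = historical_prediction(history, "morning")  # 6.0
rt = real_time_prediction(8.5)                    # 8.5
```

Neither baseline reacts to an incident until it is already visible in the data, which is why Table 1 marks both as not considering delay.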



IV. The Proposed Prediction Time Method 

In the proposed system, two models are suggested for bus arrival time prediction:

1- Machine learning technique (ANN) for offline estimation using previously collected data from the traffic database.

2- Model-based approach (Kalman filter) for online calculations in case of wide deviation between the offline
estimation and real-time data (special cases).

Figure 1 shows the preliminary flowchart of the proposed algorithm. In what follows, the basic steps of the underlying
algorithm are explained.

V. Data Collection 

In the proposed system, virtually all data that are collected and stored are multi-dimensional. Typically, a range of 
features is measured at a particular time or condition and stored as a complex data object. The data comes in the form of a 
vector of real values. 



[Figure 1 flowchart boxes: Receive the packets from the [...] module; Manipulate and store the received data; Use modeling technique (Kalman filter) to calculate the estimated arrival time; Perform the online prediction using ANN to predict the estimated arrival; Announce the estimated arrival time value]

Figure 1 : Flowchart of the proposed prediction time algorithm

VI. Proposed Neural Network 

Neural networks are statistical models of real-world systems, which are built by tuning a set of parameters. These
parameters, known as weights, describe a mapping from a set of given values, the inputs, to an associated set of values, the
outputs. The process of tuning the weights to the correct values, called training, is carried out by passing a set of examples
of input-output pairs through the model and adjusting the weights in order to minimize the error between the answer the
network gives and the desired output. Once the weights have been set, the model is able to produce answers for input values
that were not included in the training data [11, 12]. The neural network used, shown in Figure 2, consists of four layers: an
input layer, two hidden layers and an output layer. The input layer has seven nodes, the first hidden layer has 10 nodes, the
second hidden layer has 3 nodes, and the output layer has only one node.

[Figure 2 labels legible in the source: inputs Weather, Avg. Speed, Traffic status; output Traveling Time]

Figure 2: Proposed neural network structure
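
The 7-10-3-1 layout described above can be sketched as a plain forward pass. The layer sizes come from the text; everything else is an assumption made for illustration: the weights are random and untrained, the hidden layers use tanh, the output is linear, and the seven-entry input vector is hypothetical (the source names only some of the inputs). Training is omitted.

```python
import math
import random

LAYER_SIZES = [7, 10, 3, 1]  # input, first hidden, second hidden, output (from the text)

def init_network(sizes, seed=0):
    """Create random weights and zero biases for each pair of consecutive layers."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]

def forward(network, inputs):
    """Propagate one input vector through the network and return the output vector."""
    activations = list(inputs)
    last = len(network) - 1
    for index, (weights, biases) in enumerate(network):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        # Assumed activations: tanh in hidden layers, linear output for regression.
        activations = sums if index == last else [math.tanh(s) for s in sums]
    return activations

network = init_network(LAYER_SIZES)
# Hypothetical, already-normalized feature vector with seven entries.
sample = [0.2, 0.5, 0.8, 0.1, 0.0, 1.0, 0.3]
predicted = forward(network, sample)
```

With untrained weights the output value is meaningless; the sketch only shows how seven inputs are reduced to a single travel-time output through the two hidden layers.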



VII. PROPOSED KALMAN FILTER PREDICTOR 

The Kalman filter, also known as linear quadratic estimation (LQE), is an algorithm which uses a series of 
measurements observed over time, containing noise (random variations) and other inaccuracies, and produces estimates of 
unknown variables that tend to be more precise than those that would be based on a single measurement alone. More 
formally, the Kalman filter operates recursively on streams of noisy input data to produce a statistically optimal estimate of 
the underlying state. The Kalman filter uses a system's dynamics model (e.g., physical laws of motion), known control inputs 
to that system, and multiple sequential measurements (such as from sensors) to form an estimate of the system's varying 
quantities (its state ) that is better than the estimate obtained by using any one measurement alone. 

The Kalman filter estimates a process by using a form of feedback control: the filter estimates the process state at some time
and then obtains feedback in the form of (noisy) measurements. As such, the equations for the Kalman filter fall into two
groups: time update equations and measurement update equations. The time update equations are responsible for projecting
forward (in time) the current state and error covariance. The measurement update equations are responsible for the feedback,
that is, for incorporating a new measurement into the projected estimate to obtain an improved estimate. Figure 3 illustrates
the Kalman filter operations.



[Figure 3 blocks: Time Update ("Predict") and Measurement Update ("Correct")]

Figure 3: Proposed Kalman Filter structure
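
The predict/correct cycle can be illustrated with a textbook scalar Kalman filter. This is a generic sketch under simple assumptions (a constant-state model with known noise variances `q` and `r`), not the modified filter used in this project; the measurements and initial values are hypothetical.

```python
def kalman_step(x, p, z, q, r):
    """One predict/correct cycle of a scalar Kalman filter with a constant-state model.

    x, p: previous state estimate and its error variance
    z:    new noisy measurement
    q, r: process and measurement noise variances (assumed known)
    """
    # Time update (predict): project the state and error variance forward.
    x_prior = x
    p_prior = p + q
    # Measurement update (correct): blend the prediction with the measurement.
    gain = p_prior / (p_prior + r)
    x_post = x_prior + gain * (z - x_prior)
    p_post = (1.0 - gain) * p_prior
    return x_post, p_post

# Hypothetical noisy link travel times in minutes.
measurements = [6.2, 5.8, 6.5, 6.1, 5.9]
x, p = 5.0, 4.0  # initial guess and its variance (assumptions)
for z in measurements:
    x, p = kalman_step(x, p, z, q=0.01, r=0.25)
```

Each pass through the loop shrinks the error variance `p` and pulls the estimate `x` toward the measurements, which is the recursive behaviour described in the text.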



In the modified Kalman filter algorithm used in the current research project, the historical actual running times between
links at the instant (k+1) on the last three similar days in the last three weeks, together with the last running time
observation at the instant (k) on the last day, are used to predict the bus running time at the instant (k+1).
The Kalman filter equations that are used for time prediction are:

\[ g(k+1) = \frac{e(k) + \mathrm{VAR}[\text{local data}]}{e(k) + 2\,\mathrm{VAR}[\text{local data}]} \]

\[ a(k+1) = 1 - g(k+1) \]

\[ e(k+1) = \mathrm{VAR}[\text{local data}] \cdot g(k+1) \]

\[ P(k+1) = a(k+1)\,\mathrm{art}(k) + g(k+1)\,\mathrm{art}_1(k+1) \]

Where:

g = filter gain, a = loop gain, e = filter error, P = prediction,
art(k) = actual running time at the instant (k) on the last day,
art1(k+1) = actual running time of the bus at instant (k+1) on the similar day of the previous week,
art2(k+1) = actual running time of the bus at instant (k+1) on the similar day two weeks ago,
art3(k+1) = actual running time of the bus at instant (k+1) on the similar day three weeks ago, and
VAR[local data] = prediction variance, taken as the variance of art1(k+1), art2(k+1) and art3(k+1):

\[ \mathrm{VAR}[\text{local data}] = \mathrm{VAR}[\mathrm{art}_1(k+1),\ \mathrm{art}_2(k+1),\ \mathrm{art}_3(k+1)] \]

The variance VAR[local data] is calculated at each instant k+1 using the actual running time values for the last three
similar days in the last three weeks.
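The definition of the variance for a random variable is:

\[ \mathrm{VAR}[X] = E[(X - E[X])^2] \]

\[ E[X] = \mathrm{avg}(\mathrm{art}) = \frac{\mathrm{art}_1(k+1) + \mathrm{art}_2(k+1) + \mathrm{art}_3(k+1)}{3} \qquad (4) \]

Now the variance can be calculated as given in the following equations:

\[ \Delta_1 = [\mathrm{art}_1(k+1) - \mathrm{avg}(\mathrm{art})]^2 \]
\[ \Delta_2 = [\mathrm{art}_2(k+1) - \mathrm{avg}(\mathrm{art})]^2 \]
\[ \Delta_3 = [\mathrm{art}_3(k+1) - \mathrm{avg}(\mathrm{art})]^2 \qquad (5) \]

\[ \mathrm{VAR}[\text{local data}] = \frac{\Delta_1 + \Delta_2 + \Delta_3}{3} \]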
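
The recursion of this section can be sketched as follows. The sketch assumes the gain has the form g(k+1) = (e(k) + VAR[local data]) / (e(k) + 2 VAR[local data]), with the variance taken as the average of the three squared deviations from equations (4) and (5). The function names, the zero-denominator fallback and the sample running times are illustrative assumptions, not part of the original algorithm description.

```python
def local_variance(art1, art2, art3):
    """Variance of the three historical running times, following equations (4) and (5)."""
    avg = (art1 + art2 + art3) / 3.0
    deltas = [(value - avg) ** 2 for value in (art1, art2, art3)]
    return sum(deltas) / 3.0

def predict_running_time(e_k, art_k, art1, art2, art3):
    """Return the prediction P(k+1) and the updated filter error e(k+1)."""
    var = local_variance(art1, art2, art3)
    denominator = e_k + 2.0 * var
    # Assumption: with no error and no variance, weight both terms equally.
    g = (e_k + var) / denominator if denominator > 0 else 0.5
    a = 1.0 - g
    e_next = var * g
    p_next = a * art_k + g * art1
    return p_next, e_next

# Hypothetical running times in minutes.
prediction, error = predict_running_time(e_k=0.0, art_k=8.0, art1=6.0, art2=7.0, art3=8.0)
```

In this example the variance is 2/3, the gain is 0.5, and the prediction is the midpoint between yesterday's observation (8 minutes) and last week's value (6 minutes). The returned error would be fed back as e(k) at the next instant.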

VIII. SIMULATION RESULT 

To determine the predicted arrival times of a moving bus at the downstream bus stations, the GPS readings of each
equipped bus need to be projected onto the underlying transit network. In a digital transit network model, bus routes are
represented by a sequence of line features (links) as an approximation to their true geographical composition.

Such straight-line approximations are usually not accurate enough for tracking purposes. To ensure representation accuracy,
the end points of each link, also called nodes, are specified by their longitudes and latitudes. All links and nodes are
numbered according to the sequence in which the bus passes them, and they are then recorded in a file for later use.
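
Projecting a GPS reading onto a link can be sketched as an orthogonal projection onto a line segment between two nodes. This is one common way to do it and is only an assumption about the approach; the coordinates are treated as planar, which is reasonable only for short links, and the function name and sample points are hypothetical.

```python
def project_onto_link(point, start, end):
    """Project a point onto the segment start-end.

    Returns the projected point and the fraction travelled along the link (0 to 1).
    Coordinates are treated as planar (x, y), an approximation for short links.
    """
    px, py = point
    ax, ay = start
    bx, by = end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return start, 0.0
    # Clamp so the projection stays between the two nodes of the link.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return (ax + t * dx, ay + t * dy), t

# Hypothetical link from node (0, 0) to node (10, 0) and a GPS fix slightly off the road.
snapped, fraction = project_onto_link((4.0, 1.0), (0.0, 0.0), (10.0, 0.0))
```

The fraction along the link, combined with the link's position in the numbered sequence, gives the remaining distance to each downstream station.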

The neural network is trained on a set of randomly generated data for one route consisting of 6 stations, from S0 to
S5. Table 2 shows the ranges that were used for creating the random data, where "Sn, n=0,1,2,...,5" refers to the station number.



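Table 2: The ranges used to create random data

| Day | Period | s0-s1 | s1-s2 | s2-s3 | s3-s4 | s4-s5 |
|---|---|---|---|---|---|---|
| Sunday | M | 5->7 | 5->7 | 8->12 | 4->6 | 3->5 |
| Sunday | A | 7->10 | 7->10 | 15->18 | 5->8 | 4->7 |
| Monday | M | 5->7 | 5->7 | 8->12 | 4->6 | 3->5 |
| Monday | A | 7->10 | 7->10 | 15->18 | 5->8 | 4->7 |
| Tuesday | M | 7->11 | 7->11 | 10->14 | 6->8 | 5->7 |
| Tuesday | A | 9->13 | 9->13 | 17->20 | 7->10 | 6->9 |
| Wednesday | M | 7->11 | 7->11 | 10->14 | 6->8 | 5->7 |
| Wednesday | A | 9->13 | 9->13 | 17->20 | 7->10 | 6->9 |
| Thursday | M | 5->7 | 5->7 | 8->12 | 4->6 | 3->5 |
| Thursday | A | 9->14 | 9->14 | 18->21 | 8->11 | 7->10 |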
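
Generating synthetic training data from such ranges can be sketched as below. The Sunday ranges are copied from Table 2; the uniform distribution, the fixed seed, the sample count and the names are assumptions made for illustration, since the text does not state how values were drawn within each range.

```python
import random

# Sunday ranges from Table 2 as (low, high) per link; "M" and "A" are the table's period labels.
SUNDAY_RANGES = {
    "M": [(5, 7), (5, 7), (8, 12), (4, 6), (3, 5)],
    "A": [(7, 10), (7, 10), (15, 18), (5, 8), (4, 7)],
}

def sample_trip(ranges, rng):
    """Draw one running time per link, uniformly within its range (an assumption)."""
    return [rng.uniform(low, high) for low, high in ranges]

rng = random.Random(42)  # fixed seed so the synthetic data are reproducible
trips = [sample_trip(SUNDAY_RANGES["M"], rng) for _ in range(100)]
route_times = [sum(trip) for trip in trips]
```

Each generated trip holds five link running times for the links s0-s1 to s4-s5, and their sum is the running time over the whole route.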



The simulation was performed using Matlab. Figure 6 shows the result for the proposed route. The simulation results
give an acceptable mean square error of about 1.2 minutes over the whole route (max. 37 min).

Figure 6: Neural network results
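
The error measure used for the evaluation can be computed as in the following sketch. The arrival times are hypothetical, not the simulation data. Since a mean square error of times in minutes is expressed in minutes squared, the sketch also computes its square root, which is back in minutes.

```python
import math

def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical arrival times in minutes at five stations.
actual = [6.0, 12.5, 22.0, 27.0, 31.0]
predicted = [5.0, 13.5, 21.0, 28.0, 32.0]
mse = mean_squared_error(actual, predicted)  # in minutes squared
rmse = math.sqrt(mse)                        # back in minutes
```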



For Kalman filter testing, the example previously used to test the ANN is also applied to the Kalman filter
algorithm. Figure 7 shows the Kalman filter performance versus the actual running time. The simulation results give an
acceptable mean square error of about 1 minute over the whole route.

Figure 7: Kalman filter algorithm vs. real arrival time

The comparison between the actual, neural network, and Kalman filter results is shown in Figure 8.





[Figure 8 panels: Tuesday, Wednesday]

Figure 8 : Comparison between the actual, neural, and Kalman filter results



Figure 8 shows the predicted arrival times at individual bus stations for different time periods and days using the neural
network and the proposed Kalman filter techniques. As shown in the figure, prediction accuracy varies with time
period and station because of the effect of traffic along the route segments. In the test scenario, the firmware switches
between two different modes of operation to test the different arrival time calculation algorithms.

Normal mode operation: In this mode, the expected arrival times at the stations are selected within acceptable deviations
from those calculated using the ANN algorithm. The firmware loops among the bus stations using the previously selected
estimated arrival times. The server sends the estimated arrival time to the stations using the ANN-calculated values.

Congestion mode: In this mode, the expected arrival times at the stations are selected with wide deviations from those
calculated using the ANN algorithm. The firmware loops among the bus stations using the previously selected estimated
arrival times. In this case, the server calculates the estimated arrival time using the Kalman filter algorithm.
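
The switch between the two modes can be sketched as a simple threshold test on the deviation between the ANN estimate and the observed value. The 2-minute threshold, the function names and the sample values are hypothetical; the source does not state what counts as an acceptable or a wide deviation.

```python
def select_mode(ann_estimate, observed, threshold):
    """Return "normal" when the observation is close to the ANN estimate, else "congestion"."""
    deviation = abs(observed - ann_estimate)
    return "normal" if deviation <= threshold else "congestion"

def announced_arrival(ann_estimate, kalman_estimate, observed, threshold=2.0):
    """Pick the estimate the server would send to the stations."""
    mode = select_mode(ann_estimate, observed, threshold)
    return ann_estimate if mode == "normal" else kalman_estimate

# Hypothetical values in minutes.
print(announced_arrival(ann_estimate=10.0, kalman_estimate=14.0, observed=10.5))  # 10.0
print(announced_arrival(ann_estimate=10.0, kalman_estimate=14.0, observed=15.0))  # 14.0
```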

IX. CONCLUSIONS 

In this paper, a model-based technique is proposed to predict the expected bus arrival times at individual bus stops
along a service route. The proposed prediction algorithm combines real-time location data from GPS receivers built into
buses with average travel speeds of individual route segments, taking into account historical travel speed as well as temporal
and spatial variations of traffic conditions. The proposed method is a hybrid scheme that combines the robustness of a neural
network with the reliability of a Kalman filter. A case study on a real bus route is conducted to evaluate the performance of
the proposed algorithm in terms of prediction accuracy. The results indicate that the proposed system is capable of achieving
satisfactory accuracy in predicting bus arrival times.